{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hGLYrUwBmvUD"
   },
   "source": [
    "<a target=\"_blank\" href=\"https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/tutorials/adalflow_dataclasses.ipynb\">\n",
    "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
    "</a>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gHK6HFngl6iP"
   },
   "source": [
    "# 🤗 Welcome to AdalFlow!\n",
    "## The library to build & auto-optimize any LLM task pipelines\n",
    "\n",
    "Thanks for trying us out, we're here to provide you with the best LLM application development experience you can dream of 😊 any questions or concerns you may have, [come talk to us on discord,](https://discord.gg/ezzszrRZvT) we're always here to help! ⭐ <i>Star us on <a href=\"https://github.com/SylphAI-Inc/AdalFlow\">Github</a> </i> ⭐\n",
    "\n",
    "\n",
    "# Quick Links\n",
    "\n",
    "Github repo: https://github.com/SylphAI-Inc/AdalFlow\n",
    "\n",
    "Full Tutorials: https://adalflow.sylph.ai/index.html#.\n",
    "\n",
    "Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).\n",
    "\n",
    "Common use cases along with the auto-optimization:  check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).\n",
    "\n",
    "# Author\n",
    "\n",
    "This notebook was created by community contributor [Ajith](https://github.com/ajithvcoder).\n",
    "\n",
    "# Outline\n",
    "\n",
    "This is a quick introduction of what AdalFlow is capable of. We will cover:\n",
    "\n",
    "* How to use `DataClass` with `DataClassParser`.\n",
    "* How to do nested dataclass, we will test both one and two levels of nesting.\n",
    "\n",
    "**Next: Try our [auto-optimization](https://colab.research.google.com/drive/1n3mHUWekTEYHiBdYBTw43TKlPN41A9za?usp=sharing)**\n",
    "\n",
    "\n",
    "# Installation\n",
    "\n",
    "1. Use `pip` to install the `adalflow` Python package. We will need `openai` and `groq`from the extra packages.\n",
    "\n",
    "  ```bash\n",
    "  pip install adalflow[openai,groq]\n",
    "  ```\n",
    "2. Setup  `openai` and `groq` API key in the environment variables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nqe-vxB1BCux"
   },
   "source": [
    "### Install adalflow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ZaaevxNH9JMQ"
   },
   "outputs": [],
   "source": [
    "# Install adalflow with necessary dependencies\n",
    "from IPython.display import clear_output\n",
    "\n",
    "!pip install -U adalflow[openai,groq]\n",
    "\n",
    "clear_output()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip uninstall httpx anyio -y\n",
    "!pip install \"anyio>=3.1.0,<4.0\"\n",
    "!pip install httpx==0.24.1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NGE70aZ8BLuf"
   },
   "source": [
    "### Set Environment Variables\n",
    "\n",
    "Note: Enter your api keys in below cell"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "API keys have been set.\n"
     ]
    }
   ],
   "source": [
    "#  or more securely\n",
    "\n",
    "import os\n",
    "\n",
    "from getpass import getpass\n",
    "\n",
    "# Prompt user to enter their API keys securely\n",
    "groq_api_key = getpass(\"Please enter your GROQ API key: \")\n",
    "openai_api_key = getpass(\"Please enter your OpenAI API key: \")\n",
    "\n",
    "\n",
    "# Set environment variables\n",
    "os.environ[\"GROQ_API_KEY\"] = groq_api_key\n",
    "os.environ[\"OPENAI_API_KEY\"] = openai_api_key\n",
    "\n",
    "print(\"API keys have been set.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZxBkm77uBZpl"
   },
   "source": [
    "### Import necessary libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "id": "wOAiKg899Z2u"
   },
   "outputs": [],
   "source": [
    "# Import required libraries\n",
    "from dataclasses import dataclass, field\n",
    "from typing import List, Dict\n",
    "import adalflow as adal\n",
    "from adalflow.components.model_client import GroqAPIClient\n",
    "from adalflow.utils import setup_env"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1.0.0.beta.3'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "adal.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "MBW5viOG9hM8"
   },
   "source": [
    "### Basic Vannila Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "id": "YA4pAIek9ewc"
   },
   "outputs": [],
   "source": [
    "# Define the output structure using dataclass\n",
    "@dataclass\n",
    "class BasicQAOutput(adal.DataClass):\n",
    "    explanation: str = field(\n",
    "        metadata={\"desc\": \"A brief explanation of the concept in one sentence.\"}\n",
    "    )\n",
    "    example: str = field(metadata={\"desc\": \"An example of the concept in a sentence.\"})\n",
    "\n",
    "    # Control output fields order\n",
    "    __output_fields__ = [\"explanation\", \"example\"]\n",
    "\n",
    "\n",
    "# Define the template using jinja2 syntax\n",
    "qa_template = r\"\"\"<SYS>\n",
    "You are a helpful assistant.\n",
    "<OUTPUT_FORMAT>\n",
    "{{output_format_str}}\n",
    "</OUTPUT_FORMAT>\n",
    "</SYS>\n",
    "<USER> {{input_str}} </USER>\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "id": "x4__jnbP9luN"
   },
   "outputs": [],
   "source": [
    "# Define the QA component\n",
    "class QA(adal.Component):\n",
    "    def __init__(self, model_client: adal.ModelClient, model_kwargs: Dict):\n",
    "        super().__init__()\n",
    "\n",
    "        # Initialize the parser with the output dataclass\n",
    "        parser = adal.DataClassParser(data_class=BasicQAOutput, return_data_class=True)\n",
    "\n",
    "        # Set up the generator with model, template, and parser\n",
    "        self.generator = adal.Generator(\n",
    "            model_client=model_client,\n",
    "            model_kwargs=model_kwargs,\n",
    "            template=qa_template,\n",
    "            prompt_kwargs={\"output_format_str\": parser.get_output_format_str()},\n",
    "            output_processors=parser,\n",
    "        )\n",
    "\n",
    "    def call(self, query: str):\n",
    "        \"\"\"Synchronous call to generate response\"\"\"\n",
    "        return self.generator.call({\"input_str\": query})\n",
    "\n",
    "    async def acall(self, query: str):\n",
    "        \"\"\"Asynchronous call to generate response\"\"\"\n",
    "        return await self.generator.acall({\"input_str\": query})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "id": "TVi3rGvs9nte"
   },
   "outputs": [],
   "source": [
    "# Example usage\n",
    "def run_basic_example():\n",
    "    # Instantiate the QA class with Groq model\n",
    "    qa = QA(\n",
    "        model_client=GroqAPIClient(),\n",
    "        model_kwargs={\"model\": \"llama3-8b-8192\"},\n",
    "    )\n",
    "\n",
    "    # Print the QA instance details\n",
    "    print(qa)\n",
    "\n",
    "    # Test the QA system\n",
    "    response = qa(\"What is LLM?\")\n",
    "    print(\"\\nResponse:\")\n",
    "    print(response)\n",
    "    print(f\"BasicQAOutput: {response.data}\")\n",
    "    print(f\"Explanation: {response.data.explanation}\")\n",
    "    print(f\"Example: {response.data.example}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "QA(\n",
      "  (generator): Generator(\n",
      "    model_kwargs={'model': 'llama3-8b-8192'}, trainable_prompt_kwargs=[], prompt=template: <SYS>\n",
      "    You are a helpful assistant.\n",
      "    <OUTPUT_FORMAT>\n",
      "    {{output_format_str}}\n",
      "    </OUTPUT_FORMAT>\n",
      "    </SYS>\n",
      "    <USER> {{input_str}} </USER>, prompt_kwargs: {'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\\n```\\n{\\n    \"explanation\": \"A brief explanation of the concept in one sentence. (str) (required)\",\\n    \"example\": \"An example of the concept in a sentence. (str) (required)\"\\n}\\n```\\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\\n-Use double quotes for the keys and string values.\\n-DO NOT mistaken the \"properties\" and \"type\" in the schema as the actual fields in the JSON output.\\n-Follow the JSON formatting conventions.'}, prompt_variables: ['output_format_str', 'input_str']\n",
      "    (prompt): template: <SYS>\n",
      "    You are a helpful assistant.\n",
      "    <OUTPUT_FORMAT>\n",
      "    {{output_format_str}}\n",
      "    </OUTPUT_FORMAT>\n",
      "    </SYS>\n",
      "    <USER> {{input_str}} </USER>, prompt_kwargs: {'output_format_str': 'Your output should be formatted as a standard JSON instance with the following schema:\\n```\\n{\\n    \"explanation\": \"A brief explanation of the concept in one sentence. (str) (required)\",\\n    \"example\": \"An example of the concept in a sentence. (str) (required)\"\\n}\\n```\\n-Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\\n-Use double quotes for the keys and string values.\\n-DO NOT mistaken the \"properties\" and \"type\" in the schema as the actual fields in the JSON output.\\n-Follow the JSON formatting conventions.'}, prompt_variables: ['output_format_str', 'input_str']\n",
      "    (model_client): GroqAPIClient()\n",
      "    (output_processors): DataClassParser(\n",
      "      data_class=BasicQAOutput, format_type=json,            return_data_class=True, input_fields=[],            output_fields=['explanation', 'example']\n",
      "      (_output_processor): JsonParser()\n",
      "      (output_format_prompt): template: Your output should be formatted as a standard JSON instance with the following schema:\n",
      "      ```\n",
      "      {{schema}}\n",
      "      ```\n",
      "      -Make sure to always enclose the JSON output in triple backticks (```). Please do not add anything other than valid JSON output!\n",
      "      -Use double quotes for the keys and string values.\n",
      "      -DO NOT mistaken the \"properties\" and \"type\" in the schema as the actual fields in the JSON output.\n",
      "      -Follow the JSON formatting conventions., prompt_variables: ['schema']\n",
      "    )\n",
      "  )\n",
      ")\n",
      "\n",
      "Response:\n",
      "GeneratorOutput(id=None, data=BasicQAOutput(explanation='LLM stands for Large Language Model, a type of artificial intelligence trained on vast amounts of text data to generate human-like language.', example='For instance, an LLM can be used to write articles, respond to customer queries, or even create entire books.'), error=None, usage=CompletionUsage(completion_tokens=66, prompt_tokens=174, total_tokens=240), raw_response='```\\n{\\n    \"explanation\": \"LLM stands for Large Language Model, a type of artificial intelligence trained on vast amounts of text data to generate human-like language.\",\\n    \"example\": \"For instance, an LLM can be used to write articles, respond to customer queries, or even create entire books.\"\\n}', metadata=None)\n",
      "BasicQAOutput: BasicQAOutput(explanation='LLM stands for Large Language Model, a type of artificial intelligence trained on vast amounts of text data to generate human-like language.', example='For instance, an LLM can be used to write articles, respond to customer queries, or even create entire books.')\n",
      "Explanation: LLM stands for Large Language Model, a type of artificial intelligence trained on vast amounts of text data to generate human-like language.\n",
      "Example: For instance, an LLM can be used to write articles, respond to customer queries, or even create entire books.\n"
     ]
    }
   ],
   "source": [
    "run_basic_example()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1n7edLQ19ql8"
   },
   "source": [
    "### Example 1 - Movie analysis data class"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "id": "5Arp4-Dq9u49"
   },
   "outputs": [],
   "source": [
    "# 1. Basic DataClass with different field types\n",
    "@dataclass\n",
    "class MovieReview(adal.DataClass):\n",
    "    title: str = field(metadata={\"desc\": \"The title of the movie\"})\n",
    "    rating: float = field(\n",
    "        metadata={\"desc\": \"Rating from 1.0 to 10.0\", \"min\": 1.0, \"max\": 10.0}\n",
    "    )\n",
    "    pros: List[str] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\"desc\": \"List of positive points about the movie\"},\n",
    "    )\n",
    "    cons: List[str] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\"desc\": \"List of negative points about the movie\"},\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"title\", \"rating\", \"pros\", \"cons\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "id": "VLbRUzXg9yP0"
   },
   "outputs": [],
   "source": [
    "@dataclass\n",
    "class Actor(adal.DataClass):\n",
    "    name: str = field(metadata={\"desc\": \"Actor's full name\"})\n",
    "    role: str = field(metadata={\"desc\": \"Character name in the movie\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "id": "7MUcu0tk91l4"
   },
   "outputs": [],
   "source": [
    "# 2. Nested DataClass example\n",
    "\n",
    "# Have both MovieReview and Actor nested in DetailedMovieReview\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class DetailedMovieReview(adal.DataClass):\n",
    "    basic_review: MovieReview\n",
    "    cast: List[Actor] = field(\n",
    "        default_factory=list, metadata={\"desc\": \"List of main actors in the movie\"}\n",
    "    )\n",
    "    genre: List[str] = field(\n",
    "        default_factory=list, metadata={\"desc\": \"List of genres for the movie\"}\n",
    "    )\n",
    "    recommend: bool = field(\n",
    "        default_factory=str, metadata={\"desc\": \"Whether you would recommend this movie\"}\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"basic_review\", \"cast\", \"genre\", \"recommend\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example template for movie review\n",
    "movie_review_template = r\"\"\"<SYS>\n",
    "You are a professional movie critic. Analyze the given movie and provide a detailed review.\n",
    "<OUTPUT_FORMAT>\n",
    "{{output_format_str}}\n",
    "</OUTPUT_FORMAT>\n",
    "</SYS>\n",
    "<USER> Review this movie: {{movie_title}} </USER>\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the MovieReviewer component with MovieAnalysis data class\n",
    "class MovieReviewer(adal.Component):\n",
    "    def __init__(\n",
    "        self,\n",
    "        model_client: adal.ModelClient,\n",
    "        model_kwargs: Dict,\n",
    "        data_class: adal.DataClass,\n",
    "    ):\n",
    "        super().__init__()\n",
    "        self.additional_structure_prompt = (\n",
    "            \"Dont use 'type' and 'properties' in output directly give as dict\"\n",
    "        )\n",
    "        parser = adal.DataClassParser(data_class=data_class, return_data_class=True)\n",
    "        self.generator = adal.Generator(\n",
    "            model_client=model_client,\n",
    "            model_kwargs=model_kwargs,\n",
    "            template=movie_review_template,\n",
    "            prompt_kwargs={\n",
    "                \"output_format_str\": parser.get_output_format_str()\n",
    "                + self.additional_structure_prompt\n",
    "            },\n",
    "            output_processors=parser,\n",
    "        )\n",
    "\n",
    "    def call(self, movie_title: str):\n",
    "        return self.generator.call({\"movie_title\": movie_title})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DetailedMovieReview: DetailedMovieReview(basic_review=MovieReview(title='The Matrix', rating=8.7, pros=['Groundbreaking special effects and innovative action sequences', 'Thought-provoking and well-written storyline with complex themes', 'Strong performances from the cast, particularly Keanu Reeves and Laurence Fishburne'], cons=['Some viewers may find the pacing or character development uneven', \"The film's reliance on technology and virtual reality may be overwhelming for some\"]), cast=[Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity')], genre=['Science Fiction', 'Action'], recommend=True)\n",
      "BasicReview: MovieReview(title='The Matrix', rating=8.7, pros=['Groundbreaking special effects and innovative action sequences', 'Thought-provoking and well-written storyline with complex themes', 'Strong performances from the cast, particularly Keanu Reeves and Laurence Fishburne'], cons=['Some viewers may find the pacing or character development uneven', \"The film's reliance on technology and virtual reality may be overwhelming for some\"])\n",
      "Cast: [Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity')]\n"
     ]
    }
   ],
   "source": [
    "# test the data class with one level of nesting\n",
    "\n",
    "reviewer = MovieReviewer(\n",
    "    model_client=GroqAPIClient(),\n",
    "    model_kwargs={\"model\": \"llama3-8b-8192\"},\n",
    "    data_class=DetailedMovieReview,\n",
    ")\n",
    "\n",
    "response = reviewer(\"The Matrix\")\n",
    "print(f\"DetailedMovieReview: {response.data}\")\n",
    "print(f\"BasicReview: {response.data.basic_review}\")\n",
    "print(f\"Cast: {response.data.cast}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DetailedMovieReview: DetailedMovieReview(basic_review=MovieReview(title='The Matrix', rating=9.0, pros=['Innovative and groundbreaking special effects', 'Compelling and thought-provoking storyline', 'Strong performances by the lead actors', 'Excellent blending of action and philosophical themes'], cons=['Complex storyline may be confusing to some viewers', 'Released special effects may appear dated by modern standards']), cast=[Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity'), Actor(name='Hugo Weaving', role='Agent Smith')], genre=['Action', 'Science Fiction', 'Thriller'], recommend=True)\n",
      "BasicReview: MovieReview(title='The Matrix', rating=9.0, pros=['Innovative and groundbreaking special effects', 'Compelling and thought-provoking storyline', 'Strong performances by the lead actors', 'Excellent blending of action and philosophical themes'], cons=['Complex storyline may be confusing to some viewers', 'Released special effects may appear dated by modern standards'])\n",
      "Cast: [Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity'), Actor(name='Hugo Weaving', role='Agent Smith')]\n"
     ]
    }
   ],
   "source": [
    "# try use openai model\n",
    "reviewer = MovieReviewer(\n",
    "    model_client=adal.OpenAIClient(),\n",
    "    model_kwargs={\"model\": \"gpt-4o\"},\n",
    "    data_class=DetailedMovieReview,\n",
    ")\n",
    "response = reviewer(\"The Matrix\")\n",
    "print(f\"DetailedMovieReview: {response.data}\")\n",
    "print(f\"BasicReview: {response.data.basic_review}\")\n",
    "print(f\"Cast: {response.data.cast}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We see both models can handle one level of nested dataclass quite well. And the output ordering will follow the ordering specified in __output_fields__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "id": "ekr4v8Xg93en"
   },
   "outputs": [],
   "source": [
    "# 3. second level nested dataclass\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class MovieAnalysis(adal.DataClass):\n",
    "    review: DetailedMovieReview\n",
    "    box_office: float = field(\n",
    "        default=None, metadata={\"desc\": \"Box office earnings in millions of dollars\"}\n",
    "    )\n",
    "    awards: Dict[str, int] = field(\n",
    "        default=None,\n",
    "        metadata={\"desc\": \"Dictionary of award categories and number of wins\"},\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"review\", \"box_office\", \"awards\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Error at parsing output: 'type'\n",
      "Error processing the output processors: Error: 'type'\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MovieAnalysis: None\n"
     ]
    },
    {
     "ename": "AttributeError",
     "evalue": "'NoneType' object has no attribute 'review'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[20], line 13\u001b[0m\n\u001b[1;32m     11\u001b[0m response \u001b[38;5;241m=\u001b[39m analysis(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mThe Matrix\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m     12\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMovieAnalysis: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mresponse\u001b[38;5;241m.\u001b[39mdata\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m---> 13\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mDetailedMovieReview: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[43mresponse\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mdata\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mreview\u001b[49m\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m     14\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mBasicReview: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mresponse\u001b[38;5;241m.\u001b[39mdata\u001b[38;5;241m.\u001b[39mreview\u001b[38;5;241m.\u001b[39mbasic_review\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m     15\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCast: \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mresponse\u001b[38;5;241m.\u001b[39mdata\u001b[38;5;241m.\u001b[39mreview\u001b[38;5;241m.\u001b[39mcast\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m)\n",
      "\u001b[0;31mAttributeError\u001b[0m: 'NoneType' object has no attribute 'review'"
     ]
    }
   ],
   "source": [
    "# test the data class with two levels of nested dataclass\n",
    "\n",
    "# gpt-3.5-turbo model\n",
    "\n",
    "analysis = MovieReviewer(\n",
    "    model_client=adal.OpenAIClient(),\n",
    "    model_kwargs={\"model\": \"gpt-3.5-turbo\"},\n",
    "    data_class=MovieAnalysis,\n",
    ")\n",
    "\n",
    "response = analysis(\"The Matrix\")\n",
    "print(f\"MovieAnalysis: {response.data}\")\n",
    "print(f\"DetailedMovieReview: {response.data.review}\")\n",
    "print(f\"BasicReview: {response.data.review.basic_review}\")\n",
    "print(f\"Cast: {response.data.review.cast}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "MovieAnalysis: MovieAnalysis(review=DetailedMovieReview(basic_review=MovieReview(title='The Matrix', rating=9.5, pros=['Groundbreaking special effects', 'Thought-provoking themes', 'Innovative storyline', 'Strong performances from the cast'], cons=['Somewhat slow pacing in parts']), cast=[Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity')], genre=['Science Fiction', 'Action', 'Adventure'], recommend=True), box_office=463.5, awards={'Academy Awards': 4, 'MTV Movie Awards': 10, 'Saturn Awards': 7})\n",
      "DetailedMovieReview: DetailedMovieReview(basic_review=MovieReview(title='The Matrix', rating=9.5, pros=['Groundbreaking special effects', 'Thought-provoking themes', 'Innovative storyline', 'Strong performances from the cast'], cons=['Somewhat slow pacing in parts']), cast=[Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity')], genre=['Science Fiction', 'Action', 'Adventure'], recommend=True)\n",
      "BasicReview: MovieReview(title='The Matrix', rating=9.5, pros=['Groundbreaking special effects', 'Thought-provoking themes', 'Innovative storyline', 'Strong performances from the cast'], cons=['Somewhat slow pacing in parts'])\n",
      "Cast: [Actor(name='Keanu Reeves', role='Neo'), Actor(name='Laurence Fishburne', role='Morpheus'), Actor(name='Carrie-Anne Moss', role='Trinity')]\n"
     ]
    }
   ],
   "source": [
    "# test the data class with two levels of nested dataclass\n",
    "\n",
    "analysis = MovieReviewer(\n",
    "    model_client=GroqAPIClient(),\n",
    "    model_kwargs={\"model\": \"llama3-8b-8192\"},\n",
    "    data_class=MovieAnalysis,\n",
    ")\n",
    "\n",
    "response = analysis(\"The Matrix\")\n",
    "print(f\"MovieAnalysis: {response.data}\")\n",
    "print(f\"DetailedMovieReview: {response.data.review}\")\n",
    "print(f\"BasicReview: {response.data.review.basic_review}\")\n",
    "print(f\"Cast: {response.data.review.cast}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pSTrf8_t-DCx"
   },
   "source": [
    "### Example 2: Song Review\n",
    "Note: Song Review is modified by keeping Example 1 - Movie Review as a reference so that we would know how to use DataClasses for similar purposes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "id": "7g9bUa0q-B6Y"
   },
   "outputs": [],
   "source": [
    "# 1. Basic DataClass with different field types\n",
    "@dataclass\n",
    "class SongReview(adal.DataClass):\n",
    "    title: str = field(metadata={\"desc\": \"The title of the song\"})\n",
    "    album: str = field(metadata={\"desc\": \"The album of the song\"})\n",
    "    ranking: int = field(\n",
    "        metadata={\"desc\": \"Billboard peak ranking from 1 to 200\", \"min\": 1, \"max\": 200}\n",
    "    )\n",
    "    streaming: Dict[str, int] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\n",
    "            \"desc\": \"Dict of lastest approximate streaming count in spotify and in youtube. Gives the count in millions\"\n",
    "        },\n",
    "    )\n",
    "    pros: List[str] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\"desc\": \"List of positive points about the song\"},\n",
    "    )\n",
    "    cons: List[str] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\"desc\": \"List of negative points about the song\"},\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"title\", \"rating\", \"streaming\", \"pros\", \"cons\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "id": "UGhMRZht-HiB"
   },
   "outputs": [],
   "source": [
    "@dataclass\n",
    "class Artist(adal.DataClass):\n",
    "    name: str = field(metadata={\"desc\": \"Artist's full name\"})\n",
    "    role: str = field(metadata={\"desc\": \"Artist's role in the song\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "id": "sfNWgPYN-JAj"
   },
   "outputs": [],
   "source": [
    "# 2. Nested DataClass example\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class DetailedSongReview(adal.DataClass):\n",
    "    basic_review: SongReview = field(\n",
    "        default=SongReview, metadata={\"desc\": \"basic Song review details\"}\n",
    "    )\n",
    "    cast: List[Artist] = field(\n",
    "        default_factory=list,\n",
    "        metadata={\"desc\": \"List of main singer, lyrisist and musicians in the song\"},\n",
    "    )\n",
    "    genre: List[str] = field(\n",
    "        default_factory=list, metadata={\"desc\": \"List of genres for the song\"}\n",
    "    )\n",
    "    recommend: bool = field(\n",
    "        default_factory=str, metadata={\"desc\": \"Whether you would recommend this song\"}\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"basic_review\", \"cast\", \"genre\", \"recommend\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "id": "HG8rtCd8-K7t"
   },
   "outputs": [],
   "source": [
    "# 3. two levels of nesting dataclass\n",
    "\n",
    "# all these fields as we use default, it is optional, so\n",
    "# llm might not output that field if they dont have information\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class SongAnalysis(adal.DataClass):\n",
    "    review: DetailedSongReview = field(\n",
    "        default=DetailedSongReview, metadata={\"desc\": \"Song review details\"}\n",
    "    )\n",
    "    duration: float = field(default=None, metadata={\"desc\": \"Duration of the song\"})\n",
    "    awards: Dict[str, int] = field(\n",
    "        default=None,\n",
    "        metadata={\"desc\": \"Dictionary of award categories and number of wins\"},\n",
    "    )\n",
    "\n",
    "    __output_fields__ = [\"review\", \"duration\", \"awards\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "id": "v3mNeyz7-MpY"
   },
   "outputs": [],
   "source": [
    "# Example template for song review\n",
    "song_review_template = r\"\"\"<SYS>\n",
    "You are a professional song critic. Analyze the given song and provide a detailed review.\n",
    "<OUTPUT_FORMAT>\n",
    "{{output_format_str}}\n",
    "</OUTPUT_FORMAT>\n",
    "</SYS>\n",
    "<USER> Review this song: {{song_title}} </USER>\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "id": "X2eifXOU-OrE"
   },
   "outputs": [],
   "source": [
    "# Create the SongReviewer component with SongAnalysis data class\n",
    "class SongReviewer(adal.Component):\n",
    "    def __init__(self, model_client: adal.ModelClient, model_kwargs: Dict):\n",
    "        super().__init__()\n",
    "        self.additional_structure_prompt = (\n",
    "            \"Dont use 'type' and 'properties' in output directly give as dict\"\n",
    "        )\n",
    "        parser = adal.DataClassParser(\n",
    "            data_class=SongAnalysis, return_data_class=False, format_type=\"json\"\n",
    "        )\n",
    "        self.generator = adal.Generator(\n",
    "            model_client=model_client,\n",
    "            model_kwargs=model_kwargs,\n",
    "            template=song_review_template,\n",
    "            prompt_kwargs={\n",
    "                \"output_format_str\": parser.get_output_format_str()\n",
    "                + self.additional_structure_prompt\n",
    "            },\n",
    "            output_processors=parser,\n",
    "        )\n",
    "\n",
    "    def call(self, song_title: str):\n",
    "        return self.generator.call({\"song_title\": song_title})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SongAnalysis: {'review': {'basic_review': {'title': 'Shape of You', 'album': 'Asylum', 'ranking': 7, 'streaming': {'Spotify': 4.5, 'YouTube': 3.2}, 'pros': ['Catchy melody', 'Unique drum pattern', 'Vocal range showcases'], 'cons': ['Repetitive lyrics', \"Some may find the song's tone too laid-back\"]}, 'cast': [{'name': 'Ed Sheeran', 'role': 'Lead vocals, songwriting'}], 'genre': ['Pop', 'Rap'], 'recommend': True}, 'duration': 3.963666666666667}\n"
     ]
    }
   ],
   "source": [
    "analysis = SongReviewer(\n",
    "    model_client=GroqAPIClient(),\n",
    "    model_kwargs={\"model\": \"llama3-8b-8192\"},\n",
    ")\n",
    "\n",
    "response = analysis(\"Shape of you\")\n",
    "print(f\"SongAnalysis: {response.data}\")\n",
    "\n",
    "# this time as we set `return_data_class` to False in the parser, we get the output as dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Song Title: Shape of You\n",
      "Album: Asylum\n",
      "Ranking: 7\n",
      "- Spotify - 4.5 million views\n",
      "- YouTube - 3.2 million views\n",
      "\n",
      "Pros:\n",
      "- Catchy melody\n",
      "- Unique drum pattern\n",
      "- Vocal range showcases\n",
      "\n",
      "Artist's:\n",
      "- Ed Sheeran as Lead vocals, songwriting\n",
      "\n",
      "Genere:  \n",
      " Pop \n",
      " Rap \n",
      "\n",
      "Duration: 3.963666666666667 minutes\n"
     ]
    }
   ],
   "source": [
    "# Access nested data\n",
    "analysis = response.data\n",
    "print(f\"Song Title: {analysis['review']['basic_review']['title']}\")\n",
    "print(f\"Album: {analysis['review']['basic_review']['album']}\")\n",
    "print(f\"Ranking: {analysis['review']['basic_review']['ranking']}\")\n",
    "\n",
    "for platform, views in analysis[\"review\"][\"basic_review\"][\"streaming\"].items():\n",
    "    print(f\"- {platform} - {views} million views\")\n",
    "print(\"\\nPros:\")\n",
    "for pro in analysis[\"review\"][\"basic_review\"][\"pros\"]:\n",
    "    print(f\"- {pro}\")\n",
    "\n",
    "print(\"\\nArtist's:\")\n",
    "for actor in analysis[\"review\"][\"cast\"]:\n",
    "    print(f\"- {actor['name']} as {actor['role']}\")\n",
    "\n",
    "if analysis[\"review\"][\"genre\"]:\n",
    "    print(\"\\nGenere:  \")\n",
    "    for genre in analysis[\"review\"][\"genre\"]:\n",
    "        print(f\" {genre} \")\n",
    "\n",
    "if analysis[\"duration\"]:\n",
    "    print(f\"\\nDuration: {analysis['duration']} minutes\")\n",
    "\n",
    "if hasattr(analysis, \"awards\") and analysis[\"awards\"]:\n",
    "    print(\"\\nAwards:\")\n",
    "    for category, count in analysis[\"awards\"].items():\n",
    "        print(f\"- {category}: {count}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "TODOs:\n",
    "1. Add `JsonOutputParser` and `YamlOutputParser` to this notebook."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BLAF5qTEmoyW"
   },
   "source": [
    "# Issues and feedback\n",
    "\n",
    "If you encounter any issues, please report them here: [GitHub Issues](https://github.com/SylphAI-Inc/LightRAG/issues).\n",
    "\n",
    "For feedback, you can use either the [GitHub discussions](https://github.com/SylphAI-Inc/LightRAG/discussions) or [Discord](https://discord.gg/ezzszrRZvT)."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [
    "nqe-vxB1BCux",
    "NGE70aZ8BLuf"
   ],
   "provenance": []
  },
  "kernelspec": {
   "display_name": "my-project-kernel",
   "language": "python",
   "name": "my-project-kernel"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
